The introduction of the JavaScript programming language in 1995 changed the web. Until then, web pages were essentially limited to HTML and were thus static. The only way to generate content dynamically was to use appropriate technologies on the server side. JavaScript, as a programming language that could run in the web browser, changed that abruptly; it was the basis for what is now known as a “web application”: programs that were once reserved for the desktop, but which can now be run in the web browser. But there was a flaw: JavaScript could only run in the web browser, not on the server. Therefore, a second technology was always needed to cover the server side. Over the years, various technologies had their heyday here. While PHP was initially the first choice, this changed increasingly with the advent of Java and .NET. Ruby and Python also played an increasingly important role in the development of web servers after the turn of the millennium. But no matter which language you used, you always had two languages: JavaScript on the client, another on the server. In the long run, this is impractical and error-prone, and it also makes development more difficult.
This is exactly what Node.js does away with. Node.js is a runtime environment for JavaScript that does not run in the web browser, but on the server. This makes it possible to use JavaScript for the development of the backend as well, so that the technological break that always existed until then is no longer necessary. Conveniently, Node.js is based on the same JavaScript engine as the Chrome web browser, namely V8, and thus offers excellent support for modern language features. Node.js was first introduced to the public in 2009, so it is now over 10 years old and is supported by all major web and cloud providers. Unlike Java and .NET, for example, Node.js is not developed by a single company but by a community. This does not detract from its suitability for large and complex enterprise projects. On the contrary, the very fact that Node.js is under an open source license has now become an important factor for many companies when selecting a suitable base technology.
Installing Node.js
If you want to use Node.js, the first step is to install the runtime environment. In theory, you can compile Node.js yourself, but pre-compiled binary packages are available for all common platforms, including macOS, Linux, and Windows. Node.js also runs on the Raspberry Pi and other ARM-based platforms without any problems. Since the binary packages are only a few megabytes in size, the basic installation is done very quickly.
There are several ways to install it. The most obvious is to use a suitable installer, which can be downloaded from the official website [1]. Although the installation takes only a few clicks, this approach is not recommended for professional use, because the official installers do not allow side-by-side installation of different versions of Node.js. If you perform an update, the system-wide installed version of Node.js is replaced by the new one, which can lead to compatibility problems with existing applications and modules. Therefore, it is better to rely on a tool like nvm [2], which installs different versions of Node.js side by side and manages them.
However, nvm is only available for macOS and Linux. For Windows, there are ports or replicas, for example nvm-windows [3], whose functionality is similar. In general, however, macOS and Linux are better supported in the world of Node.js. Most tools and modules are primarily developed for these two platforms, and even though JavaScript code is theoretically platform-independent, there are always small details that fail or cause trouble on Windows. Although the situation has improved considerably in recent years due to Microsoft’s commitment in this area, the roots of the community are still noticeable. To install nvm, a simple command on the command line is enough:
$ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.3/install.sh | bash
Afterwards, the command line must be restarted, because otherwise nvm cannot find some environment variables. Simply closing and reopening the terminal is sufficient. Then the desired version of Node.js can be installed. For example, to install version 14.9.0, the following command is sufficient:
$ nvm install 14.9.0
If necessary, the version number can also be shortened. For example, if you simply want to install the latest version from the 14.x series, you can omit the specific minor and patch version:
$ nvm install 14
All installed versions can be displayed in a list:
$ nvm ls
To select and activate one of the installed versions, the following command is used, where again an abbreviated version number may be specified:
$ nvm use 14.9.0
Often you want to set a specific version as the default, for example, so that it is active in every newly opened command line. For this, nvm offers the concept of aliases, where default serves as an alias for a specific version. For example, to specify that you normally want to work with the latest installed version from the 14.x series, define the default alias as follows:
$ nvm alias default 14
If you take a closer look at the Node.js website, you will notice that there are two versions available for download: a so-called LTS version and a current version. LTS stands for Long-Term Support, which means that this version of Node.js is provided with security updates and bug fixes for a particularly long time. However, particularly long in this context means only 30 months. Support for the current version, on the other hand, expires after just 12 months. It is therefore advisable to always rely on the LTS versions for production use and to update them once a year; according to the Node.js roadmap, a new LTS version is always released in October.
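A running Node.js process can report its own version and whether it belongs to an LTS line. The following minimal sketch uses the built-in properties process.version and process.release.lts; the helper name describeRuntime is chosen here purely for illustration.

```javascript
'use strict';

// Hypothetical helper: summarizes the Node.js version that executes this file.
const describeRuntime = function () {
  // process.version looks like 'v14.9.0'; drop the leading 'v' and split it.
  const [major, minor, patch] = process.version.
    slice(1).
    split('.').
    map(part => Number.parseInt(part, 10));

  // process.release.lts holds the LTS codename and is only set for LTS releases.
  const ltsName = process.release.lts || null;

  return { major, minor, patch, ltsName };
};

console.log(describeRuntime());
```

If ltsName is null, the process is running on a version that is not an LTS release.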
Hello world!
After the installation, we start Node.js. If you call it without any further parameters, it opens in an interactive mode where you can enter and evaluate JavaScript statements live. This is occasionally handy for quickly trying out a language construct but is hardly suitable for actual development. You can exit this mode by pressing CTRL + C twice. To develop applications, therefore, a different procedure is needed. First, you need any kind of editor or IDE, provided that the tool of choice can save plain text files with the .js extension. Node.js does not enforce that an application must have a specific name, though app.js has become common for the primary file. Occasionally, you may encounter other names, such as server.js or run.js, but app.js is used below. You can put any JavaScript code in such a file, for example a “hello world” program:
console.log('Hello world!');
To run this application, all you need to do is call Node.js and pass the filename as a parameter:
$ node app.js
Node.js translates the specified application into executable machine code using V8 and then starts execution. Since the program ends after outputting the string to the screen, Node.js also terminates, so you return to the command line.
However, a pure console program is still not very impressive. It gets much more interesting when you use Node.js to develop your first small web server. To do this, you need the http module, which is one of the few modules built into Node.js out of the box. Unlike .NET and Java, Node.js does not contain a class or function library with hundreds of thousands of classes and functions. Instead, Node.js limits itself to the absolute essentials. The philosophy behind it is that everything else can be provided via third-party modules from the community. This may seem unusual at first glance, but it keeps the core of Node.js lean and lightweight. The other built-in modules can be found in the documentation [4].
To load a module, you have to import it using the built-in require function. This behaves similarly to using in C# or import in Java, yet there is one serious difference: unlike those statements, the require function returns a result, namely a reference to the loaded module. This reference must be stored in a variable, otherwise the module cannot be accessed. Therefore, the first line of the Node.js application is as follows:
const http = require('http');
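Since require returns an ordinary value, the result can be handled like any other object. The following minimal sketch stores the module reference in a variable and, as an alternative, picks out individual members via destructuring; both variants refer to the same built-in module.

```javascript
'use strict';

// require returns the module itself, here an object with functions and constants.
const http = require('http');

// Because the result is a plain object, individual members can be destructured.
const { createServer, STATUS_CODES } = require('http');

console.log(typeof http);                         // the module is an object
console.log(createServer === http.createServer);  // same function, shorter name
console.log(STATUS_CODES[404]);                   // status text for code 404
```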
Then you can use the createServer function of the http module to create a server. It is important to make sure that you pass it a function as a parameter that can react to incoming requests and send back a corresponding response. This function is thus called again for each incoming request and can generate an individual result in each case. In the simplest case it always returns the same text. The function res.write is used for this purpose. Afterwards it is necessary to close the connection. This is done with the function res.end. The call to createServer in turn also returns a reference, but this time to the created web server:
const server = http.createServer((req, res) => {
  res.write('Hello world!');
  res.end();
});
Next, the web server must be bound to a port so that it can be reached from outside. This is done using the listen function, which is passed the desired port as a parameter:
server.listen(3000);
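Binding to a port happens asynchronously, so it can be helpful to know when the server actually accepts connections and to notice when binding fails. The following sketch passes a callback to listen and registers a handler for the error event. Port 0 (which lets the operating system pick a free port) and the immediate call to server.close are assumptions made only so that the example terminates on its own.

```javascript
'use strict';

const http = require('http');

const server = http.createServer((req, res) => {
  res.write('Hello world!');
  res.end();
});

// Emitted if binding fails, e.g. with the code EADDRINUSE for an occupied port.
server.on('error', err => {
  console.error(`Could not start server: ${err.code}`);
});

// Port 0 asks the operating system for any free port (assumption for this demo).
server.listen(0, () => {
  const { port } = server.address();

  console.log(`Server listening on port ${port}.`);

  // Shut down again right away so that this example exits by itself.
  server.close();
});
```

In a real application, you would pass a fixed port such as 3000 and omit the call to server.close.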
Last but not least, it is advisable to get into the habit of enabling strict mode in every .js file from the very beginning. Strict mode is a special JavaScript execution mode in which some dangerous language constructs are not allowed, for example, the accidental creation of global variables by assigning to an undeclared name. To enable the mode, insert the string 'use strict' at the beginning of the file as a kind of statement. The full contents of the app.js file then look like Listing 1.
'use strict';

const http = require('http');

const server = http.createServer((req, res) => {
  res.write('Hello world!');
  res.end();
});

server.listen(3000);
If you now start this application again, you can access it from the web browser by calling the address http://localhost:3000. In fact, you can also append arbitrary paths to the URL: since the program does not provide any special handling for paths, HTTP verbs, or anything else, it always responds in an identical way. If you are actually interested in the path or, say, the HTTP verb, you can access these values via the req parameter. The program shown in Listing 2 outputs both values, so it produces output such as GET /.
'use strict';

const http = require('http');

const server = http.createServer((req, res) => {
  res.write(`${req.method} ${req.url}`);
  res.end();
});

server.listen(3000);
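Building on this, a request handler can also branch on the method and the path. The following is a minimal sketch and not a full routing solution; the /ping path, the response texts, and the name handleRequest are made up for illustration. It additionally uses the fact that res.end accepts the final chunk of data directly, which saves a separate call to res.write.

```javascript
'use strict';

const http = require('http');

// Hypothetical handler: answers GET /ping and rejects everything else.
const handleRequest = function (req, res) {
  if (req.method !== 'GET') {
    res.statusCode = 405; // Method Not Allowed

    return res.end('Only GET is supported.');
  }

  // req.url also contains the query string, so '/ping?x=1' would not match here.
  if (req.url !== '/ping') {
    res.statusCode = 404; // Not Found

    return res.end(`Unknown path ${req.url}`);
  }

  res.statusCode = 200;
  res.end('pong');
};

// Not started here; call server.listen(3000) to try it out in the browser.
const server = http.createServer(handleRequest);
```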
In addition to the http module, there are several other built-in modules, for example for accessing the file system (fs), for handling paths (path) or for TCP (net). Node.js also offers support for HTTPS (https) and HTTP/2 (http2) out of the box. Nevertheless, for most tasks, you will have to rely on modules from the community.
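As a brief illustration of two of these modules, the following sketch uses path to build a file path and fs to write and read a small text file. It relies only on built-in functionality; the file name and its location in the temporary directory are arbitrary choices, and the synchronous functions are used only to keep the example short.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a platform-independent path inside the system's temporary directory.
const file = path.join(os.tmpdir(), `hello-${process.pid}.txt`);

// Write a text file, read it back, and remove it again.
fs.writeFileSync(file, 'Hello world!', 'utf8');
const content = fs.readFileSync(file, 'utf8');

fs.unlinkSync(file);

console.log(path.extname(file)); // the file extension including the dot
console.log(content);
```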
Include modules from third parties
Modules developed by the community can be found in a central and publicly accessible registry on the internet, the so-called npm registry. npm is also the name of a command line tool that acts as a package manager for Node.js and is included in the installation of Node.js. This means that npm can basically be invoked in the same way as Node.js itself. A simple example of a module from the community is the processenv module [5], which provides access to environment variables. This is also possible with the built-in means of Node.js, but then you always get the values of the environment variables as strings, even if the value is a number or a boolean, for example. The processenv module, on the other hand, converts the values appropriately so that you automatically get the desired type.
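The difference can be seen with the built-in process.env object. The following sketch sets two example variables itself (PORT and VERBOSE are arbitrary names) and shows that the values come back as strings, so any conversion has to be done by hand. The helper readNumber is hypothetical and is not part of processenv.

```javascript
'use strict';

// Set example values; in practice they would come from the shell.
process.env.PORT = '3000';
process.env.VERBOSE = 'false';

console.log(typeof process.env.PORT); // a string, not a number

// Pitfall: the non-empty string 'false' counts as true in a condition.
console.log(Boolean(process.env.VERBOSE));

// Hypothetical helper that converts by hand and falls back to a default.
const readNumber = function (name, defaultValue) {
  const raw = process.env[name];

  if (raw === undefined) {
    return defaultValue;
  }

  const value = Number(raw);

  return Number.isNaN(value) ? defaultValue : value;
};

console.log(readNumber('PORT', 8080));
console.log(readNumber('MY_HTTP_SERVER_UNSET', 8080));
```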
Before you can install a third-party module, you first have to add the file package.json to your own application. This file contains metadata about the application. Only a name and a version number are mandatory, which is why the minimum content of this file has the following form:
{
  "name": "my-http-server",
  "version": "0.0.1"
}
It should be noted that the version number must always consist of three parts and follow the concept of semantic versioning [6]. In addition, however, dependencies can also be stored in this file, whereby required third-party modules are explicitly added. This makes it much easier to restore a certain state later or to get an overview of which third-party modules an application depends on. To install a module, call npm as follows:
$ npm install processenv
This extends the package.json file with the dependencies section, where the dependency is entered as follows:
{
  "name": "my-http-server",
  "version": "0.0.1",
  "dependencies": {
    "processenv": "^3.0.2"
  }
}
Also, npm downloads the module from the npm registry and copies it locally to a directory named node_modules. It is recommended that you exclude this directory from version control. If you delete it or retrieve the code for your application from the version control system, which does not include the directory, you can easily restore its contents:
$ npm install
The names of the desired modules can now be omitted, since they are listed together with their version numbers in the package.json file. A conspicuous feature of this file is the caret (^) in front of the version number of processenv. It has the effect that npm install does not necessarily install exactly version 3.0.2, but possibly also a newer version, if it is considered compatible according to semantic versioning. However, this mechanism is dangerous, because a nominally compatible update can still change behavior, so it is advisable to consistently remove the caret from the package.json file. To avoid having to do this by hand over and over again, npm can alternatively be configured not to write the caret at all. To do this, create a file named .npmrc in the user’s home directory and store the following content there:
save-exact=true
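To make the effect of the caret in front of a version number more tangible, the following sketch checks whether a given version would be accepted by such a range. It is a deliberately simplified illustration and not npm's actual implementation: it only handles plain versions of the form major.minor.patch with a major version of 1 or higher, and the function name satisfiesCaret is made up.

```javascript
'use strict';

// Splits '3.0.2' into the numbers 3, 0 and 2.
const parse = version => version.split('.').map(Number);

// Simplified rule for major versions >= 1: same major version, and at least
// as new as the version given in the range.
const satisfiesCaret = function (version, range) {
  const [major, minor, patch] = parse(version);
  const [baseMajor, baseMinor, basePatch] = parse(range.replace(/^\^/u, ''));

  if (major !== baseMajor) {
    return false;
  }
  if (minor !== baseMinor) {
    return minor > baseMinor;
  }

  return patch >= basePatch;
};

console.log(satisfiesCaret('3.0.2', '^3.0.2')); // true, exact match
console.log(satisfiesCaret('3.1.0', '^3.0.2')); // true, newer minor version
console.log(satisfiesCaret('4.0.0', '^3.0.2')); // false, new major version
console.log(satisfiesCaret('3.0.1', '^3.0.2')); // false, older patch version
```

With save-exact=true, this question does not arise, because only the exact version is recorded.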
Finally, in addition to the node_modules directory, npm has also created a file called package-lock.json. It is used to lock the exact version numbers even if a caret (^) is specified in package.json. However, it has its quirks, so if npm behaves strangely, it is often a good idea to delete this file and the node_modules directory and run npm install again from scratch.
Once a module has been installed via npm, it can be loaded in the same way as a module built into Node.js. In that case, Node.js recognizes that it is not a built-in module and loads the appropriate code from the node_modules directory:
const processenv = require('processenv');
Then you can use the module. In this example application, it would be conceivable to read the desired port from an environment variable. However, if this variable is not set, specifying a port as a fallback is still a good idea (Listing 3).
'use strict';

const http = require('http');
const processenv = require('processenv');

const port = processenv('PORT', 3000);

const server = http.createServer((req, res) => {
  res.write(`${req.method} ${req.url}`);
  res.end();
});

server.listen(port);
Structure the application
As applications grow larger, it is not advisable to put all the code in a single file. Instead, it is necessary to structure the application into files and directories. This is already possible with the example program, even though it is still very manageable, because the actual application logic can be separated from the server. To illustrate this, however, an intermediate step is introduced first: the application logic, previously passed inline to createServer, is extracted into its own named function (Listing 4).
'use strict';

const http = require('http');
const processenv = require('processenv');

const port = processenv('PORT', 3000);

const app = function (req, res) {
  res.write(`${req.method} ${req.url}`);
  res.end();
};

const server = http.createServer(app);

server.listen(port);
In fact, it would also be conceivable to wrap this function in a function again in order to be able to configure it. Instead of the app function, you would then get a getApp function. The outer function can then be equipped with any parameters that the inner function can access. The signature of the inner function must not be changed, because it is predefined by Node.js through createServer:
const getApp = function () {
  const app = function (req, res) {
    res.write(`${req.method} ${req.url}`);
    res.end();
  };

  return app;
};
However, this also means that you have to adjust the call to createServer accordingly:
const server = http.createServer(getApp());
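As an illustration of such a configurable variant, the following sketch gives the outer function an options parameter that the inner function accesses as a closure. The prefix option is purely hypothetical; the signature of the inner function remains (req, res).

```javascript
'use strict';

const http = require('http');

// The outer function takes the configuration ...
const getApp = function ({ prefix = 'Received' } = {}) {
  // ... and the inner function can still access it when a request arrives.
  const app = function (req, res) {
    res.write(`${prefix}: ${req.method} ${req.url}`);
    res.end();
  };

  return app;
};

// Not started here; add server.listen(3000) to run it.
const server = http.createServer(getApp({ prefix: 'Request' }));
```

Each call to getApp returns a new handler with its own configuration, so several differently configured servers could be created from the same code.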
Now the application is prepared to be split into different files. The getApp function is to be placed in its own file called getApp.js. Since the definition of the function is then missing in the app.js file, it must be loaded there, which, unsurprisingly, is again done using the require function. However, a relative or an absolute path must now be specified so that the require function can distinguish files to be loaded from installed modules with the same name. The file extension .js can be specified but does not have to be (Listing 5).
'use strict';

const getApp = require('./getApp');
const http = require('http');
const processenv = require('processenv');

const port = processenv('PORT', 3000);

const server = http.createServer(getApp());

server.listen(port);
If you now try to start the application in the usual way, you get an error message. This is because Node.js considers everything defined inside a file as private – unless you explicitly export it. Therefore, it tries to import the content of the file getApp.js, but nothing is exported from there. The remedy is to assign the getApp function to the module.exports object (Listing 6).
'use strict';

const getApp = function () {
  const app = function (req, res) {
    res.write(`${req.method} ${req.url}`);
    res.end();
  };

  return app;
};

module.exports = getApp;
Whatever a file exports this way will be imported again by require: So if you export a function, you get a function afterwards; if you export an object, you get an object, and so on.
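The following sketch demonstrates this with an object instead of a function. So that it fits into a single runnable file, it first writes a small module to a temporary directory and then loads it; the file name settings.js and its contents are invented for this example.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway module that exports an object (hypothetical content).
const directory = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));
const file = path.join(directory, 'settings.js');

fs.writeFileSync(file, `module.exports = { port: 3000, name: 'demo' };`);

// require returns exactly what the file assigned to module.exports.
const settings = require(file);

console.log(settings.port, settings.name);

// Loading the same file again yields the identical, cached object.
const isSameObject = require(file) === settings;

console.log(isSameObject);

// Remove the temporary file and directory again.
fs.unlinkSync(file);
fs.rmdirSync(directory);
```

Because loaded modules are cached, all files of an application that load the same module share one and the same exported value.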
If you start the application again, it runs as before. The only unpleasant thing is the directory structure since the main directory becomes increasingly full. It is obvious that with even more files it quickly becomes confusing. The position of the files package.json and package-lock.json is predefined, as well as the position of the node_modules directory, and also the file app.js is well placed on the top level. However, any further code placed here will be disruptive:
/
  node_modules/
  app.js
  getApp.js
  package.json
  package-lock.json
Therefore, many projects introduce a directory called lib, which does not contain the main executable file of the application, but any other code. Adapting the directory structure for this project results in the following structure:
/
  lib/
    getApp.js
  node_modules/
  app.js
  package.json
  package-lock.json
But now the import in the file app.js no longer fits, because the file getApp.js is still searched for in the same directory as the file app.js. So it is necessary to adjust the parameter of require:
const getApp = require('./lib/getApp');
As you can see, it is quite easy to structure code in Node.js this way. Directories take over the role of namespaces. There is no further subdivision of this kind. The next step is to add more functionality to the application, which means writing more code and including more third-party modules from npm. One of the biggest changes when you start working with Node.js is the multitude of npm modules that you come into contact with over time, even on small projects. The idea behind this is that, in terms of complexity, it is more beneficial to maintain many small building blocks whose power comes from their flexible combinability than to use a few large chunks.
Outlook
This concludes the first part of this series on Node.js. Now that the basics are in place, the next part will look at writing web APIs. This will include topics like routing, processing JSON as input and output, validating data, streaming, and the like.
The author’s company, the native web GmbH, offers a free German video course on Node.js with close to 30 hours of running time [7]. The first three episodes deal with the topics covered in this article, such as installing, getting started, and using npm and modules. Therefore, this course is recommended for anyone interested in more details.
Links & Literature
[2] https://github.com/nvm-sh/nvm
[3] https://github.com/coreybutler/nvm-windows
[4] https://nodejs.org/dist/latest-v14.x/docs/api/